Angular clustering of photometrically classified quasar 
ABSTRACT 

The angular clustering of 230,829 photometrically selected quasar candidates from
the SDSS NBCKDE catalogue with photometric redshifts in the range
$0.8 \le z_{phot} \le 2.2$ is studied using the angular two-point correlation
function (2pCF). For this purpose we developed and applied our own technique for
generating random catalogues. The resulting angular 2pCF of photometrically
selected quasars on scales of 0.6' - 40' is well fitted by the power law
$w(\theta) = (\theta_0/\theta)^{\alpha}$ with parameters
$\theta_0 = 2.3^{+\ldots}_{-\ldots}$ arcsec and $\alpha = 0.87 \pm 0.06$, which
agree well with previous studies of earlier releases of this catalogue, as well
as with results on the clustering of X-ray point sources, which are mostly active
galactic nuclei. Inspection of the sample showed that, besides the well-known
stellar contamination of photometrically selected quasar candidates, there is
also a small (about 0.1%) contamination by artifacts of the automatic selection
of point-like sources, such as star formation regions in spiral galaxies or parts
of interference crosses of bright stars.
1 INTRODUCTION 

Quasars, as the brightest extragalactic objects, play an important role in the
study of the large-scale structure of the Universe. They are the only
observational source of information about the inhomogeneity and clustering of
matter (including dark matter) at cosmological redshifts. The most powerful
results on quasar clustering were obtained using the two largest quasar surveys
to date: the 2-degree Field (2dF) Quasar Survey, which resulted in the 2QZ
catalogue (Croom et al. 1998), and the Sloan Digital Sky Survey (SDSS), the
second stage of which was completed with the 7th data release (Abazajian et al.
2009). The two-point correlation functions (2pCF) (Peebles 1980) of quasars,
$\xi(r)$, are important characteristics of the spatial inhomogeneity of matter
that can be compared with cosmological theories of structure formation. The
development of the 2dF and SDSS surveys gave a powerful incentive to investigate
various aspects of the correlation functions of quasars (see, e.g., Croom et al.
(2005); Ross et al. (2009) and references therein).

The study of the matter distribution with the help of extragalactic redshift
surveys faces two main problems. The first is that surveys of extragalactic
objects provide information only about the distribution of luminous matter, which
is biased relative to the dark matter (Dekel & Rees 1987). Moreover, the biasing
may depend on the physical peculiarities of extragalactic objects, such as
morphological type (Einasto et al. 2007; Ross et al. 2007), luminosity (Beisbart
& Kerscher 2000; Sorrentino et al.), color index (Coil et al. 2008), star
formation rate (Owers et al.), etc., and it evolves with redshift (Croom et al.
2005; Weinstein et al. 2004; Porciani et al. 2004; Myers et al. 2006, 2007;
da Angela et al. 2008; Mountrichas et al. 2009).



The second problem of the 3-dimensional analysis of the distribution of
extragalactic objects relates to their observed redshifts. These redshifts are
the only tool for measuring distances between objects on cosmological scales, but
they are 'contaminated' by measurement errors and non-Hubble motions. This
results in the so-called redshift-space distortions, namely the Kaiser (1987) and
'Finger of God' effects (see, e.g., Ross et al. (2009); da Angela et al. (2008);
Mountrichas et al. (2009); Ivashchenko et al. (2010) and references therein).

To avoid the problem of redshift-space distortions, the projected, $w(\sigma)$,
and angular, $w(\theta)$, 2pCFs are usually used. In the first case the
projections $\sigma$ of the 3-dimensional distances $r$ onto the plane
perpendicular to the line of sight are used, and the parameters of the
3-dimensional 2pCF $\xi(r)$ can be reconstructed from the parameters of
$w(\sigma)$. Here it is usually assumed that redshift inaccuracy has a negligible
effect on $\sigma$ compared with its effect on $r$. In the second case only the
angular distances on the sky are used, and the parameters of $\xi(r)$ can be
reconstructed from $w(\theta)$ using Limber's equation (Limber 1953). This
deprojection of the 2pCF is much more complicated than in the first case because
it requires the redshift distribution function of the objects and analytical
functions describing the evolution of clustering. Nevertheless, taking into
account the smaller sizes and larger volumes of quasar spectroscopic surveys
compared with galaxy ones, the angular 2pCF is still appealing for quasars
because it does not require redshift information and thus allows the use of
catalogues of photometrically classified quasars, which contain an order of
magnitude more objects than spectroscopic quasar catalogues.
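To make the deprojection step concrete, the power-law case of Limber's equation can be written out explicitly. The form below is a sketch under assumptions that are not stated in the text: a flat geometry with comoving distance $\chi(z)$, clustering that does not evolve in comoving coordinates, small angles ($\theta$ in radians), and a redshift distribution $p(z)$ normalized to unity. The symbols $p(z)$, $\chi(z)$, $A$ and $H_\gamma$ are introduced here for illustration only.

```latex
% Power-law spatial 2pCF and the resulting angular 2pCF (small-angle Limber limit)
\xi(r) = \left(\frac{r_0}{r}\right)^{\gamma}
\quad\Longrightarrow\quad
w(\theta) = A\,\theta^{\,1-\gamma},
\qquad
A = H_\gamma\, r_0^{\gamma}
    \int p(z)^2 \left(\frac{d\chi}{dz}\right)^{-1} \chi(z)^{\,1-\gamma}\, dz ,
\qquad
H_\gamma = \frac{\Gamma(1/2)\,\Gamma\!\left[(\gamma-1)/2\right]}{\Gamma(\gamma/2)} .
```

Comparing this with an angular power law $w(\theta) = (\theta_0/\theta)^{\alpha}$ gives $\alpha = \gamma - 1$ and $\theta_0^{\alpha} = A$, which shows why the redshift distribution enters the recovery of $r_0$ but not the recovery of the slope.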

The angular 2pCF of quasars was previously studied by Myers et al. (2006, 2007)
using the catalogues of photometrically selected quasar candidates from SDSS
NBCKDE (Richards et al. 2004, 2009) based on the Early and the 4th data releases
of SDSS. They showed that the angular 2pCF of quasars is well fitted by a power
law $w(\theta) = (\theta_0/\theta)^{\alpha}$, where $\alpha$ is the slope of the
angular 2pCF, and noticed two breaks in this power law on scales of ~1' and ~25'.
Some studies have also been carried out on the angular 2pCF of luminous red
galaxies (LRG), which are the most distant galaxies observed, with a distribution
peaking around $z \approx 0.5$, and on the two-point cross-correlation function
of quasars and LRG (Mountrichas et al. 2009; Ross et al. 2007; Sawangwit et al.
2011). It is also interesting to compare the results on the angular clustering of
quasars with those for X-ray point-like sources, a significant part of which is
thought to be active galactic nuclei (AGN).

In this paper we present the results of our study of the angular 2pCF of
photometrically selected quasar candidates from the SDSS NBCKDE catalogue
(Richards et al. 2009). The sample, its selection and properties, along with the
technique for calculating the angular 2pCF and the method for generating random
catalogues, are described in Sec. 2. The results and their discussion are
presented in Sec. 3. Finally, in Sec. 4 we summarize the results.



2 THE SAMPLES AND THE TECHNIQUE 

2.1 Angular 2pCF 

According to Peebles (1980), the angular 2pCF $w(\theta)$ of a distribution of
objects determines the probability $dP$ of simultaneously finding two objects
inside small solid angles $d\Omega_1$ and $d\Omega_2$ on the unit sphere,
separated by the angle $\theta$, as

$$dP = n^2 \left[1 + w(\theta)\right] d\Omega_1\, d\Omega_2, \tag{1}$$

where $n$ is the number density of objects in a given sample. In practice the
angular 2pCF is calculated from the numbers of object pairs at different
separations. Four estimators are known in the literature:

$$w_{LS}(\theta) = \frac{n_r(n_r-1)}{n(n-1)}\,\frac{DD(\theta)}{RR(\theta)} - \frac{n_r-1}{n}\,\frac{DR(\theta)}{RR(\theta)} + 1, \tag{2}$$

$$w_{H}(\theta) = \frac{4 n n_r}{(n-1)(n_r-1)}\,\frac{DD(\theta)\,RR(\theta)}{DR(\theta)\,DR(\theta)} - 1, \tag{3}$$

$$w_{PH}(\theta) = \frac{n_r(n_r-1)}{n(n-1)}\,\frac{DD(\theta)}{RR(\theta)} - 1, \tag{4}$$

$$w_{DP}(\theta) = \frac{2 n_r}{n-1}\,\frac{DD(\theta)}{DR(\theta)} - 1, \tag{5}$$

which are the Landy-Szalay (Landy & Szalay 1993), Hamilton (Hamilton 1993),
Peebles-Hauser (Peebles & Hauser 1974) and Davis-Peebles (Davis & Peebles 1983)
estimators, respectively. Here $DD(\theta)$ and $RR(\theta)$ are the numbers of
pairs with separation $\theta$ in the initial (data-data) and random
(random-random) samples, respectively, with each pair counted once, and
$DR(\theta)$ is the number of cross-pairs between the initial and random samples
(data-random). The normalizing coefficients containing $n$ and $n_r$ account for
the different numbers of objects in the initial, $n$, and random, $n_r$, samples.
The random sample should reproduce a random distribution of objects that matches
the redshift and angular distributions of the initial sample as closely as
possible.
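The pair counts and the four estimators of this section can be sketched in a few lines of Python. This is an illustrative brute-force version, not the code used for the paper: the function names are hypothetical, the pair counting scales as the square of the sample size (so it only suits small test samples), and it assumes that DD and RR count each unordered pair once while DR counts all $n\,n_r$ cross-pairs, the convention under which the normalizing coefficients above are mutually consistent.

```python
import numpy as np


def angular_separation(ra1, dec1, ra2, dec2):
    """Great-circle separation (radians) between points given in radians."""
    sin_ddec = np.sin((dec2 - dec1) / 2.0)
    sin_dra = np.sin((ra2 - ra1) / 2.0)
    h = sin_ddec**2 + np.cos(dec1) * np.cos(dec2) * sin_dra**2
    return 2.0 * np.arcsin(np.sqrt(np.clip(h, 0.0, 1.0)))


def auto_pairs(ra, dec, bins):
    """Pair counts per separation bin, each unordered pair counted once."""
    i, j = np.triu_indices(len(ra), k=1)
    sep = angular_separation(ra[i], dec[i], ra[j], dec[j])
    return np.histogram(sep, bins=bins)[0].astype(float)


def cross_pairs(ra1, dec1, ra2, dec2, bins):
    """Cross-pair counts per separation bin over all n1 * n2 pairs."""
    sep = angular_separation(ra1[:, None], dec1[:, None],
                             ra2[None, :], dec2[None, :])
    return np.histogram(sep.ravel(), bins=bins)[0].astype(float)


def estimators(DD, RR, DR, n, n_r):
    """Four estimators of w(theta); n and n_r are the data and random sample sizes."""
    norm = n_r * (n_r - 1) / (n * (n - 1))
    return {
        "LS": norm * DD / RR - (n_r - 1) / n * DR / RR + 1.0,   # Landy-Szalay
        "H": 4.0 * n * n_r / ((n - 1) * (n_r - 1)) * DD * RR / DR**2 - 1.0,  # Hamilton
        "PH": norm * DD / RR - 1.0,                              # Peebles-Hauser
        "DP": 2.0 * n_r / (n - 1) * DD / DR - 1.0,               # Davis-Peebles
    }
```

With unclustered data, where DD, RR and DR are all proportional to their total pair numbers, every estimator returns zero, which is a quick sanity check on the normalization.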



2.2 SDSS NBCKDE data 

Our sample is taken from the SDSS NBCKDE Catalogue of Photometrically Classified
Quasar Candidates (Richards et al. 2009), which contains 1,015,082 quasar
candidates selected from the photometric imaging data of SDSS using a
non-parametric Bayesian classification kernel density estimator (NBC-KDE). The
objects are all point sources to a limiting magnitude of $i = 21.3$ from
8417 deg$^2$ of imaging from SDSS Data Release 6. According to the authors, the
overall efficiency (quasars:quasar candidates) of the catalogue is ~95%.

For our study we selected only the objects with photometric redshifts in the
range $0.8 \le z_{phot} \le 2.2$ and photometric redshift range probability
$z_{photprob} > 0.5$. This redshift range is known as the 'SDSS window'
(Weinstein et al. 2004), i.e. the redshift range where the algorithm for the
photometric selection of quasar candidates is most efficient. As the sky coverage
of SDSS consists of one big 'piece' and three narrow near-equatorial 'stripes',
we excluded these stripes to reduce boundary effects, so our right ascension
range is $100^\circ \le \alpha \le 270^\circ$. The resulting sample (called the
full sample throughout the paper) contains 320,761 objects. Its sky coverage and
photometric redshift distribution are presented in Fig. 1 and Fig. 2 (solid
line), respectively.
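The selection cuts quoted above amount to a simple boolean mask over the catalogue columns. The sketch below is only an illustration: the function name and the argument names `ra_deg`, `zphot` and `zphotprob` are assumptions and need not match the column names of the actual NBCKDE catalogue files.

```python
import numpy as np


def select_sample(ra_deg, zphot, zphotprob):
    """Boolean mask for the cuts quoted in the text (argument names are hypothetical)."""
    ra_deg = np.asarray(ra_deg, dtype=float)
    zphot = np.asarray(zphot, dtype=float)
    zphotprob = np.asarray(zphotprob, dtype=float)
    in_window = (zphot >= 0.8) & (zphot <= 2.2)          # 'SDSS window'
    reliable = zphotprob > 0.5                            # redshift range probability
    in_main_area = (ra_deg >= 100.0) & (ra_deg <= 270.0)  # drops the equatorial stripes
    return in_window & reliable & in_main_area
```

Applying the mask to every column of the catalogue (for example `ra_deg[mask]`) yields the rows of the full sample.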

The KDE photometric selection algorithm for quasar candidates has its
limitations. Based on classifying simulated quasars, Richards et al. (2004) found
that the KDE technique is 95% efficient, with the remaining 5% attributed to
contamination by stars. Besides this, there could be other sources of
contamination caused by errors of the automatic survey. To check for these
effects we selected the first 30,000 objects from our sample (about 10% of the
sample) and examined them by eye using the SDSS SkyServer web service. Among
these 30,000 objects, 28 appeared to be bright blue parts of spiral galaxies,
probably star formation regions, 1 is a faint extended object, probably a galaxy
or nebula, misclassified as a point-like source ('STAR' in SDSS nomenclature),
and 4 are parts of interference crosses from bright stars. Thus the contamination
of the sample by misclassified point-like sources is only ~0.1% (33 of 30,000
objects), which can be neglected in comparison with the stellar contamination.

As the main criterion of the KDE photometric selection algorithm is based on
quasar magnitudes and colors that have been corrected for Galactic extinction
using the maps of Schlegel et al. (1998), any systematic errors in the reddening
model can induce additional effects on the clustering results (see Ross et al.
(2009) for a discussion of possible effects). That is why, following Ross et al.
(2009) and Ivashchenko et al. (2010), we divided our sample into a low-reddening
($E(B-V) \le 0.0217$) sample and a high-reddening ($E(B-V) > 0.0217$) sample.
These samples contain 128,757 and 192,004 objects, respectively, and their sky
coverage is presented in Fig. 3 and Fig. 4. As one can see from Fig. 2, the
redshift distributions of both samples are similar to that of the full sample,
and the relative distribution of low- and high-reddening regions over the sky is
clearly seen in Fig. 3 and Fig. 4.

Figure 1. Sky coverage of the full sample in equatorial coordinates.

Figure 2. Redshift distribution of the full, low- and high-reddening samples.

Figure 3. Sky coverage of the low-reddening sample in equatorial coordinates.

Figure 4. Sky coverage of the high-reddening sample in equatorial coordinates.



2.3 Initial and random samples 



The random catalogues were generated using the following technique. First, the
sky area covered by the sample was divided into 'square' cells, and each cell was
then filled with the same number of random points (random $\alpha$ and $\delta$)
as the initial sample has in that cell, distributed uniformly in $\alpha$ and
following a $\cos\delta$ distribution in $\delta$. Here a 'square' cell means a
quadrangle spanning the same number of degrees in $\alpha$ and $\delta$, which is
in fact a trapezoid on the sky. This technique shares with the imaging-mask
approach (see, e.g., Myers et al. (2006)) the idea of preserving the original
inhomogeneity of the sky coverage of the sample, but unlike that approach it does
not require any information about the observing conditions. An important aspect
of our technique is the choice of the cell size. It should be small enough to
reproduce all the density fluctuations of the sample and large enough not to
smooth out the physical clustering of objects. To choose the appropriate cell
size we tested sizes from $1^\circ \times 1^\circ$ up to about
$17^\circ \times 17^\circ$. Fig. 5 shows the relative fluctuations (rms) of the
number density from cell to cell as a function of the cell side size. As one can
see, for small sizes the fluctuations grow as the cell size decreases, as
expected due to clustering, but at scales of about $10^\circ$ they become
constant, which means that these scales are larger than the scales of
inhomogeneity. The fluctuations around the constant value on the largest scales
result from the small number of cells of these sizes. Hence we chose the smallest
possible size, $10^\circ \times 10^\circ$.

Figure 5. Relative density fluctuations as a function of cell side size.
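The cell-based generation of a random catalogue can be sketched as follows. This is a minimal illustration under stated assumptions rather than the authors' implementation: the function name `random_catalogue` is hypothetical, cells are aligned to multiples of the cell size starting from zero, only cells that contain data are filled, and the $\cos\delta$ distribution is obtained by drawing $\sin\delta$ uniformly within each cell.

```python
import numpy as np


def random_catalogue(ra_deg, dec_deg, cell_deg=10.0, rng=None):
    """Draw, for every occupied cell, as many random points as it holds data points."""
    rng = np.random.default_rng() if rng is None else rng
    ra_deg = np.asarray(ra_deg, dtype=float)
    dec_deg = np.asarray(dec_deg, dtype=float)

    # Integer cell indices in right ascension and declination.
    idx = np.stack([np.floor(ra_deg / cell_deg).astype(int),
                    np.floor(dec_deg / cell_deg).astype(int)], axis=1)
    cells, counts = np.unique(idx, axis=0, return_counts=True)

    ra_out, dec_out = [], []
    for (i_ra, i_dec), m in zip(cells, counts):
        ra_lo = i_ra * cell_deg
        dec_lo = max(i_dec * cell_deg, -90.0)
        dec_hi = min((i_dec + 1) * cell_deg, 90.0)
        # Uniform in alpha.
        ra_out.append(rng.uniform(ra_lo, ra_lo + cell_deg, int(m)))
        # Uniform in sin(delta), i.e. a cos(delta) density in delta.
        s = rng.uniform(np.sin(np.radians(dec_lo)),
                        np.sin(np.radians(dec_hi)), int(m))
        dec_out.append(np.degrees(np.arcsin(s)))
    return np.concatenate(ra_out), np.concatenate(dec_out)
```

Calling the function repeatedly with different generator states gives independent random catalogues that share the cell-to-cell counts of the data, which is the property the cell size of $10^\circ$ is chosen to balance against smoothing of real clustering.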

To avoid effects that can arise from using samples with sky coverage like that in
Fig. 3 and Fig. 4, especially in connection with the specific technique of random
sample generation described above, we selected three subsamples from the initial
sample for studying the angular 2pCF and the possible influence of reddening on
the results. The first, full subsample is a patch of the sky with
$130^\circ < \alpha < 240^\circ$, $0^\circ < \delta < 60^\circ$, containing
230,829 objects. The second subsample is a patch with
$140^\circ < \alpha < 230^\circ$, $30^\circ < \delta < 60^\circ$, containing
80,107 objects (71,190 of which, or 89%, are 'low-reddening' quasars according to
the definition given in Sec. 2.2), and the third is a patch with
$130^\circ < \alpha < 240^\circ$, $0^\circ < \delta < 20^\circ$, containing
90,185 objects (79,450 of which, or 88%, are 'high-reddening' quasars). We
formally call the last two subsamples low-reddening and high-reddening,
respectively.

Using the technique described above we generated 20 
random catalogues for each of our 3 samples. Thus in each 
case the values of RR and DR were calculated as the arith- 
metic means of 20 corresponding values. 



2.4 Error estimation 

We assume that the main contribution to the statistical errors of $w(\theta)$
comes from the dispersion of the $DD$ values, because using 20 random samples to
calculate $DR$ and $RR$ is formally equivalent to using a random sample 20 times
larger. Thus we calculate $\sigma_w$ in the following way (for $n_r = n$, as in
our random catalogues):

$$\sigma_w^2 = \left(\frac{\partial w}{\partial DD}\right)^2 \sigma_{DD}^2 = \left(\frac{1}{RR}\right)^2 \sigma_{DD}^2, \tag{6}$$

where we used Poisson errors $\sigma_{DD} = \sqrt{DD}$.
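The error estimate of this section reduces to a one-line propagation of the Poisson uncertainty of the data-data counts. The sketch below assumes $\sigma_{DD} = \sqrt{DD}$ and treats RR as the mean over the random catalogues; the function name and the optional arguments `n` and `n_r`, which restore the normalizing factor when the data and random samples differ in size, are assumptions added for illustration.

```python
import numpy as np


def sigma_w(DD, RR_mean, n=None, n_r=None):
    """Poisson error on w(theta) propagated from the data-data pair counts only."""
    DD = np.asarray(DD, dtype=float)
    RR_mean = np.asarray(RR_mean, dtype=float)
    # With equal sample sizes the normalizing factor is unity.
    norm = 1.0 if n is None or n_r is None else n_r * (n_r - 1) / (n * (n - 1))
    return norm * np.sqrt(DD) / RR_mean
```

For example, a bin with 400 data-data pairs and a mean of 200 random-random pairs has an error of 20/200 = 0.1 on $w(\theta)$.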



3 RESULTS AND DISCUSSION 

In Fig. 6 the angular 2pCF of the full subsample calculated using the four
different estimators (2)-(5) is presented. As one can see, all of them agree well
with each other on scales smaller than ~80 arcmin. The differences on larger
scales, apart from the obvious reasons (the different functional dependences of
$w(\theta)$ on $DD$, $RR$ and $DR$ for the different estimators and the limits of
their applicability), could result from peculiarities of the random sample
generation technique, namely the fact that the cell size is only about 6 times
larger than the scales at which the discrepancy occurs.

Figure 6. Four estimations of the angular 2pCF for the full subsample
(Peebles-Hauser, Landy-Szalay, Hamilton and Davis-Peebles estimators; horizontal
axis: $\theta$, arcminutes).

For further calculations we used the Landy-Szalay estimator as the most
conventional one and fitted the 2pCF with a power law

$$w(\theta) = \left(\frac{\theta_0}{\theta}\right)^{\alpha}, \tag{7}$$

where $\theta_0$ is the angular correlation length and $\alpha$ is the slope,
related to the slope $\gamma$ of the 3D 2pCF as $\gamma = 1 + \alpha$.
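A power-law fit of this kind can be sketched as a $\chi^2$ search over a grid in the $\{\theta_0, \alpha\}$ plane. This is an illustration, not the fitting procedure of the paper, whose likelihood function is not specified in the text: the sketch assumes independent Gaussian errors per bin, and the function names and grids are hypothetical. The angles $\theta$ and $\theta_0$ must be in the same units, for example arcseconds.

```python
import numpy as np


def power_law(theta, theta0, alpha):
    """Angular power law w(theta) = (theta0 / theta) ** alpha."""
    return (theta0 / theta) ** alpha


def chi2_grid(theta, w, sigma, theta0_grid, alpha_grid):
    """Chi-square of the power-law model on a (theta0, alpha) grid."""
    t0, a = np.meshgrid(theta0_grid, alpha_grid, indexing="ij")
    model = (t0[..., None] / theta) ** a[..., None]
    return np.sum(((w - model) / sigma) ** 2, axis=-1)


def best_fit(theta, w, sigma, theta0_grid, alpha_grid):
    """Grid point with minimum chi-square, plus the full chi-square surface."""
    chi2 = chi2_grid(theta, w, sigma, theta0_grid, alpha_grid)
    i, j = np.unravel_index(np.argmin(chi2), chi2.shape)
    return theta0_grid[i], alpha_grid[j], chi2
```

Under the Gaussian assumption, contours of $\Delta\chi^2 = 2.30$, $6.18$ and $11.83$ above the minimum bound the joint 1, 2 and 3$\sigma$ regions for two parameters, which is one common way to draw confidence levels in the $\{\theta_0, \alpha\}$ plane.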

In Fig. 7 the 2pCF for the full subsample (Landy-Szalay estimator) is shown
together with the best fits over the whole range 0.06' - 40' and separately for
the ranges < 1' and > 1'. The 1, 2 and 3$\sigma$ levels of the likelihood
function for these fits in the $\{\theta_0, \alpha\}$ plane are shown in the
bottom panel. In Fig. 8 the 2pCF for the low- and high-reddening subsamples is
presented together with the 1, 2 and 3$\sigma$ levels of the likelihood function
for the best fits in the $\{\theta_0, \alpha\}$ plane for the same angular
ranges. The values of the parameters $\alpha$ and $\theta_0$ are presented in
Table 1. As one can see, the values of the slopes for the three subsamples agree
within the 3$\sigma$ levels; thus one can neglect possible selection effects
caused by different Galactic extinction in different parts of the survey.
Moreover, the redshift distributions of quasars with different reddening are
similar (as shown in Fig. 2).

In Table 2 the slopes of the angular 2pCF from previous studies by Myers et al.
(2006, 2007) of photometrically classified quasars from previous releases of the
SDSS NBCKDE catalogue are presented. For comparison we also present the results
for LRG (Sawangwit et al. 2011), which are the most distant galaxies seen in
large redshift surveys, their cross-correlation function with quasars
(Mountrichas et al. 2009), and two examples of recent studies of the angular
clustering of X-ray point-like sources (Ebrero et al. 2009; Elyiv et al. 2011),
more than 95% of which are usually considered to be AGNs. Our results agree well
with the results of Myers et al. (2006, 2007) at the same angular scales and a
similar redshift range, as well as with the results of Elyiv et al. (2011) for
X-ray point-like sources over a broader range of angular scales and with a close
mean redshift. The somewhat higher slope of the angular 2pCF of X-ray AGNs could
be explained by the fact that obscured AGNs, which are largely missed in optical
surveys, are seen in X-rays (see, e.g., Elyiv et al. (2011) for a discussion).

Figure 7. Top: the angular 2pCF for the full subsample (Landy-Szalay estimator)
together with the best fits. Bottom: 1, 2 and 3$\sigma$ levels of the likelihood
function for the parameters $\alpha$ and $\theta_0$ for the ranges (left to
right) 0.06' - 1', 1' - 40' and 0.06' - 40'.



4 CONCLUSIONS 

We calculated the angular 2pCF for a subsample of 230,829 photometrically
classified quasar candidates from the largest catalogue of these objects to date,
SDSS NBCKDE (Richards et al. 2009), with photometric redshifts in the range
$0.8 \le z_{phot} \le 2.2$, known as the 'SDSS window' (Weinstein et al. 2004),
and photometric redshift range probability $z_{photprob} > 0.5$. For this purpose
we developed and applied our own technique for generating random catalogues.

The resulting angular 2pCF of photometrically selected quasars within 0.6' - 40'
is well fitted by the power law $w(\theta) = (\theta_0/\theta)^{\alpha}$ with
parameters $\theta_0 = 2.3^{+\ldots}_{-\ldots}$ arcsec and
$\alpha = 0.87 \pm 0.06$, which agree well with the previous studies by Myers et
al. (2006, 2007) of photometrically classified quasars from earlier releases of
the same SDSS NBCKDE catalogue, as well as with the results for X-ray point
sources by Elyiv et al. (2011).

We also examined our sample and found that, besides the well-known stellar
contamination of photometrically selected quasar candidates, there is also a
small (about 0.1%) contamination by artifacts of the automatic selection of point
sources, such as bright blue regions in spiral galaxies or parts of interference
crosses of bright stars. The subsamples with high and low Galactic extinction
(reddening) showed similar results for the 2pCF, which means that one can neglect
the selection effects of the catalogue caused by varying reddening when studying
quasar clustering.

Figure 8. Top: the angular 2pCF for the low- and high-reddening subsamples
(Landy-Szalay estimator). Middle and bottom: 1, 2 and 3$\sigma$ levels of the
likelihood function for the parameters $\alpha$ and $\theta_0$ for the ranges
(left to right) 0.06' - 1', 1' - 40' and 0.06' - 40' for the low- and
high-reddening subsamples, respectively.
Table 1. The slopes $\alpha$ and correlation lengths $\theta_0$ for the full, low-reddening and high-reddening subsamples for the Landy-Szalay estimator
within the ranges 0.06' - 1', 1' - 40' and 0.06' - 40'.

               full                          low-reddening                 high-reddening
range          $\alpha$       $\theta_0$, arcsec   $\alpha$       $\theta_0$, arcsec   $\alpha$       $\theta_0$, arcsec
0.06' - 1'     0.9 ± 0.3      …                    0.9 ± 0.4      3.2 +…/-2.5          1.2 ± 0.4      …
1' - 40'       0.93 ± 0.10    4.1 +…/-…            1.17 +…/-…     8.3 +…/-…            0.93 ± 0.14    5.5 +…/-…
0.06' - 40'    0.87 ± 0.06    2.3 +…/-…            1.02 +…/-…     4.5 ± 1.3            0.86 +…/-…     3.6 +…/-…



Table 2. The slopes a of the angular 2pCF of quasars, LRG, and X-ray sources from previous studies by other authors. Here 'phot.' 
stands for photometrically classified objects. 



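objects                           $\theta$ range    z       method        slope $\alpha$      source
phot. quasars (SDSS NBCKDE DR1)   0.04' - 1'        ≈ 1.4   autocorr.     …                   Myers et al. (2006)
phot. quasars (SDSS NBCKDE DR1)   1' - 25'          ≈ 1.4   autocorr.     0.55 ± 0.29         Myers et al. (2006)
phot. quasars (SDSS NBCKDE DR1)   2' - 250'         ≈ 1.4   autocorr.     0.98 ± 0.15         Myers et al. (2006)
phot. quasars (SDSS NBCKDE DR1)   0.04' - 250'      ≈ 1.4   autocorr.     0.91 ± 0.08         Myers et al. (2006)
phot. quasars (SDSS NBCKDE DR4)   0.16' - 61'       ≈ 1.4   autocorr.     0.928 ± 0.055       Myers et al. (2007)
phot. LRG (SDSS DR5)              0.1' - 60'        0.35    autocorr.     1.07 ± 0.01         Sawangwit et al. (2011)
phot. LRG (2SLAQ)                 0.1' - 60'        0.55    autocorr.     1.01 ± 0.01         Sawangwit et al. (2011)
phot. LRG (AAΩ)                   0.1' - 60'        0.68    autocorr.     0.96 ± 0.01         Sawangwit et al. (2011)
quasars(2SLAQ)-LRG(2SLAQ)         0.1' - 100'       ≈ 0.5   cross-corr.   0.7 ± 0.2           Mountrichas et al. (2009)
quasars(2QZ)-LRG(2SLAQ)           0.1' - 100'       ≈ 0.5   cross-corr.   0.7 ± 0.2           Mountrichas et al. (2009)
quasars(DR5)-LRG(2SLAQ)           0.1' - 100'       ≈ 0.5   cross-corr.   0.8 ± 0.1           Mountrichas et al. (2009)
XMM 0.5-2 keV sources             ~0.8' - 17'       0.96    autocorr.     1.12 ± 0.04         Ebrero et al. (2009)
XMM 2-10 keV sources              ~0.8' - 17'       0.94    autocorr.     1.00 +0.10/-0.11    Ebrero et al. (2009)
XMM 0.5-4.5 keV sources           ~0.8' - 17'       0.77    autocorr.     1.47 +0.43/-…       Ebrero et al. (2009)
XMM-LSS 0.5-2 keV sources         ~0.33' - 200'     ≈ 1.1   autocorr.     0.81 ± 0.02         Elyiv et al. (2011)
XMM-LSS 2-10 keV sources          ~0.33' - 200'     ≈ 1.0   autocorr.     1.00 ± 0.04         Elyiv et al. (2011)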